90 research outputs found

    Unbiased Watermark for Large Language Models

    The recent advancements in large language models (LLMs) have sparked growing apprehension regarding their potential misuse. One approach to mitigating this risk is to incorporate watermarking techniques into LLMs, allowing for the tracking and attribution of model outputs. This study examines a crucial aspect of watermarking: how significantly watermarks impact the quality of model-generated outputs. Previous studies have suggested a trade-off between watermark strength and output quality. However, our research demonstrates that, with an appropriate implementation, it is possible to integrate watermarks without affecting the output probability distribution. We refer to this type of watermark as an unbiased watermark. This has significant implications for the use of LLMs, as it becomes impossible for users to discern whether a service provider has incorporated watermarks. Furthermore, the presence of watermarks does not compromise the performance of the model in downstream tasks, ensuring that the overall utility of the language model is preserved. Our findings contribute to the ongoing discussion around responsible AI development, suggesting that unbiased watermarks can serve as an effective means of tracking and attributing model outputs without sacrificing output quality.
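    Below is a minimal, hypothetical sketch of one well-known way to obtain a distribution-preserving watermark: the Gumbel-max trick with pseudorandom noise derived from a secret key and the recent context. It illustrates the general idea of an unbiased watermark; the function names and hashing scheme are assumptions, not the paper's construction.

        # Sketch of distribution-preserving ("unbiased") watermarked sampling via the
        # Gumbel-max trick. Because argmax(logits + Gumbel noise) is distributed exactly
        # as softmax(logits), the output distribution is unchanged, yet a detector that
        # knows the secret key can re-derive the noise and test for the watermark.
        import hashlib
        import numpy as np

        def _seed_from_context(key: bytes, context_ids: list) -> int:
            # Hash the secret key together with the last few tokens to get a seed.
            digest = hashlib.sha256(key + str(context_ids[-4:]).encode()).digest()
            return int.from_bytes(digest[:8], "little")

        def watermarked_sample(logits: np.ndarray, key: bytes, context_ids: list) -> int:
            rng = np.random.default_rng(_seed_from_context(key, context_ids))
            gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0, 1) noise
            return int(np.argmax(logits + gumbel))

        # Example: sample one token from a toy 5-token vocabulary.
        print(watermarked_sample(np.array([1.0, 0.5, 0.2, -1.0, 0.0]), b"secret", [3, 14, 15, 9]))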

    When Source-Free Domain Adaptation Meets Label Propagation

    Source-free domain adaptation, where only a pre-trained source model is used to adapt to the target distribution, is a more general approach to achieving domain adaptation. However, it can be challenging to accurately capture the inherent structure of the target features due to the lack of supervised information on the target domain. To tackle this problem, we propose a novel approach called Adaptive Local Transfer (ALT) that achieves efficient feature clustering from the perspective of label propagation. ALT divides the target data into inner and outlier samples based on an adaptive threshold over the learning state, and applies a customized learning strategy that best fits each group's data properties. Specifically, inner samples are utilized for learning intra-class structure thanks to their relatively well-clustered properties, while the low-density outlier samples are regularized by input consistency to achieve high accuracy with respect to the ground-truth labels. In this way, local clustering is prevented from forming spurious clusters while label information is effectively propagated among subpopulations. Empirical evidence demonstrates that ALT outperforms state-of-the-art methods on three public benchmarks: Office-31, Office-Home, and VisDA.
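    The split-and-regularize idea can be pictured with a short, hypothetical PyTorch sketch: samples whose prediction confidence exceeds an adaptive threshold are treated as inner samples and pulled toward others sharing their pseudo-label, while the remaining outliers receive an input-consistency penalty between two augmented views. The function name, threshold, and concrete losses are illustrative assumptions, not the ALT implementation.

        # Hedged sketch of an inner/outlier split with per-group losses (illustrative only).
        import torch
        import torch.nn.functional as F

        def split_losses(logits_weak, logits_strong, feats, threshold=0.9):
            probs = F.softmax(logits_weak, dim=1)
            conf, pseudo = probs.max(dim=1)
            inner = conf >= threshold                     # well-clustered, confident samples
            loss_inner = feats.new_tensor(0.0)
            if inner.sum() > 1:
                f = F.normalize(feats[inner], dim=1)
                same = (pseudo[inner][:, None] == pseudo[inner][None, :]).float()
                loss_inner = -((f @ f.t()) * same).sum() / same.sum()  # pull same-pseudo-label pairs together
            loss_outlier = feats.new_tensor(0.0)
            if (~inner).any():                            # low-density samples: consistency between views
                loss_outlier = F.kl_div(F.log_softmax(logits_strong[~inner], dim=1),
                                        probs[~inner], reduction="batchmean")
            return loss_inner, loss_outlier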

    Leveraging Foundation Models to Improve Lightweight Clients in Federated Learning

    Federated Learning (FL) is a distributed training paradigm that enables clients scattered across the world to cooperatively learn a global model without divulging confidential data. However, FL faces a significant challenge in the form of heterogeneous data distributions among clients, which leads to reduced performance and robustness. A recent approach to mitigating the impact of heterogeneous data distributions is the use of foundation models, which offer better performance at the cost of larger computational overhead and slower inference. We introduce foundation model distillation to assist the federated training of lightweight client models and increase their performance under heterogeneous data settings while keeping inference costs low. Our results show improved global model performance on a balanced test set containing rarely observed samples, even under extreme non-IID client data distributions. We conduct a thorough evaluation of our framework with different foundation model backbones on CIFAR10, with varying degrees of data heterogeneity ranging from class-specific data partitions across clients to Dirichlet data sampling parameterized by concentration values between 0.01 and 1.0. (Comment: 6 pages + appendix)
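    A minimal sketch of the distillation term, assuming the standard soft-label formulation: each lightweight client minimizes its usual supervised loss plus a KL divergence toward the frozen foundation model's temperature-scaled predictions. The temperature, weighting, and function name are assumptions for illustration; the paper's exact loss may differ.

        # Client-side objective: supervised cross-entropy + distillation from a frozen
        # foundation-model teacher (generic knowledge-distillation form).
        import torch.nn.functional as F

        def client_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
            ce = F.cross_entropy(student_logits, labels)                # local supervised loss
            kd = F.kl_div(F.log_softmax(student_logits / temperature, dim=1),
                          F.softmax(teacher_logits / temperature, dim=1),
                          reduction="batchmean") * temperature ** 2     # soft-label distillation
            return alpha * ce + (1.0 - alpha) * kd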

    Huatuo-26M, a Large-scale Chinese Medical QA Dataset

    In this paper, we release the largest medical Question Answering (QA) dataset to date, with 26 million QA pairs. We benchmark many existing approaches on our dataset in terms of both retrieval and generation. Experimental results show that existing models perform far below expectations and that the released dataset remains challenging in the era of pre-trained language models. Moreover, we experimentally demonstrate the benefit of the proposed dataset in several respects: (i) training models for other QA datasets in a zero-shot fashion; (ii) serving as external knowledge for retrieval-augmented generation (RAG); and (iii) improving existing pre-trained language models by using the QA pairs as a pre-training corpus for continued training. We believe that this dataset will not only contribute to medical research but also benefit both patients and clinical doctors. See https://github.com/FreedomIntelligence/Huatuo-26M.
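    As a hypothetical illustration of use (ii), QA pairs can serve as an external knowledge base for retrieval-augmented generation: index the questions, retrieve the closest pair for an incoming query, and prepend its answer to the prompt. The toy pairs and helper name below are placeholders, not content from the dataset.

        # Tiny TF-IDF retrieval sketch over placeholder QA pairs (illustrative only).
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        qa_pairs = [  # stand-ins; the real corpus holds 26 million Chinese medical QA pairs
            ("What are common symptoms of influenza?", "Fever, cough, sore throat, and fatigue."),
            ("How is type 2 diabetes usually managed?", "Diet, exercise, metformin, and glucose monitoring."),
        ]
        vectorizer = TfidfVectorizer().fit([q for q, _ in qa_pairs])
        question_index = vectorizer.transform([q for q, _ in qa_pairs])

        def build_rag_prompt(query: str) -> str:
            sims = cosine_similarity(vectorizer.transform([query]), question_index)[0]
            q, a = qa_pairs[int(sims.argmax())]
            return f"Reference Q: {q}\nReference A: {a}\nUser question: {query}"

        print(build_rag_prompt("What symptoms does the flu cause?"))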

    Predicting 1-, 3-, 5-, and 8-year all-cause mortality in a community-dwelling older adult cohort: relevance for predictive, preventive, and personalized medicine

    Background: Population aging is a global public health issue involving an increased prevalence of age-related diseases and a concomitant burden on medical resources and the economy. Ninety-two diseases have been identified as age-related, accounting for 51.3% of the global adult disease burden. The economic cost per capita for people over 60 years is 10 times that of the younger population. From the perspective of predictive, preventive, and personalized medicine (PPPM), developing a risk-prediction model can help identify individuals at high risk of all-cause mortality and provides an opportunity for targeted prevention through personalized intervention at an early stage. However, there is still a lack of prediction models to support the healthcare of community-dwelling older adults.
    Objectives: This study aims to develop an accurate 1-, 3-, 5-, and 8-year all-cause mortality risk-prediction model using clinical multidimensional variables, and to investigate risk factors for 1-, 3-, 5-, and 8-year all-cause mortality in community-dwelling older adults to guide primary prevention.
    Methods: This is a two-center cohort study. Inclusion criteria: (1) community-dwelling adult, (2) resided in the districts of Chaonan or Haojiang for more than 6 months in the past 12 months, and (3) completed a health examination. Exclusion criteria: (1) age less than 60 years, (2) more than 30 incomplete variables, and (3) no signed informed consent. The primary outcome of the study was all-cause mortality, obtained from face-to-face interviews, telephone interviews, and the medical death database from 2012 to 2021. Finally, we enrolled 5085 community-dwelling adults, 60 years and older, who underwent routine health screening in the Chaonan and Haojiang districts, southern China, from 2012 to 2021. Of these, 3091 participants from Chaonan were recruited as the primary training and internal validation cohort, while 1994 participants from Haojiang were recruited as the external validation cohort. A total of 95 clinical multidimensional variables, including demographics, lifestyle behaviors, symptoms, medical history, family history, physical examination, laboratory tests, and electrocardiogram (ECG) data, were collected to identify candidate risk factors and characteristics. Risk factors were identified using least absolute shrinkage and selection operator (LASSO) models and multivariable Cox proportional hazards regression. A nomogram predictive model for 1-, 3-, 5-, and 8-year all-cause mortality was constructed. The accuracy and calibration of the nomogram were assessed using the concordance index (C-index), integrated Brier score (IBS), receiver operating characteristic (ROC) curves, and calibration curves. The clinical utility of the model was assessed using decision curve analysis (DCA).
    Results: Nine independent risk factors for 1-, 3-, 5-, and 8-year all-cause mortality were identified: increased age, male sex, alcohol status, higher daily liquor consumption, history of cancer, elevated fasting glucose, lower hemoglobin, higher heart rate, and the occurrence of heart block. These risk factors are low-cost and easy to obtain, making them convenient for clinical application, and they provide new insights and targets for the development of personalized prevention and interventions for high-risk individuals. The areas under the curve (AUC) of the nomogram model were 0.767, 0.776, and 0.806, and the C-indexes were 0.765, 0.775, and 0.797, in the training, internal validation, and external validation sets, respectively. The IBS was less than 0.25, indicating good calibration. Calibration and decision curves showed that the predicted probabilities were in good agreement with the observed probabilities and had good clinical predictive value for PPPM.
    Conclusion: The personalized risk-prediction model can identify individuals at high risk of all-cause mortality, help deliver primary care aimed at preventing all-cause mortality, and support personalized medical treatment for these high-risk individuals from the PPPM perspective. Strict control of daily liquor consumption, lowering fasting glucose, raising hemoglobin, controlling heart rate, and treatment of heart block could be beneficial for improving survival in older populations.
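    A hedged sketch of the core modelling step described above (LASSO-style penalized Cox regression evaluated by the C-index), using the lifelines library. The toy data frame, column names, and penalty strength are assumptions for illustration only, not the study's data or code.

        # Penalized (L1) Cox proportional hazards model, a stand-in for the LASSO + Cox workflow.
        import pandas as pd
        from lifelines import CoxPHFitter

        df = pd.DataFrame({                       # toy stand-in for the cohort variables
            "age": [72, 65, 80, 69, 77, 61],
            "male": [1, 0, 1, 0, 1, 0],
            "fasting_glucose": [6.1, 5.4, 7.8, 5.9, 8.2, 5.1],
            "hemoglobin": [12.5, 13.8, 11.2, 14.1, 10.9, 13.5],
            "heart_rate": [78, 66, 90, 72, 95, 64],
            "time_years": [8.0, 8.0, 2.5, 8.0, 1.2, 8.0],   # follow-up time
            "died": [0, 0, 1, 0, 1, 0],                     # all-cause mortality event
        })

        # l1_ratio=1.0 makes the penalty purely L1, shrinking uninformative coefficients toward zero.
        cph = CoxPHFitter(penalizer=0.05, l1_ratio=1.0)
        cph.fit(df, duration_col="time_years", event_col="died")
        print(cph.concordance_index_)             # discrimination, analogous to the reported C-index
        print(cph.predict_survival_function(df.iloc[:2], times=[1, 3, 5, 8]))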

    MM-wave wide-scan cylindrical dielectric lens antennas

    No full text

    Rock Mass Quality Evaluation Based on Unascertained Measure and Intuitionistic Fuzzy Sets

    No full text
    The evaluation of rock mass quality is of great significance to the design and construction of geotechnical engineering projects. To evaluate the quality of engineering rock mass scientifically and to handle the fuzzy information in rock mass quality evaluation reasonably, a model for rock mass quality evaluation based on unascertained measure and intuitionistic fuzzy sets (UM-IFS) was proposed. First, the membership degree of each rock mass quality evaluation index was determined by the single-index measure function of unascertained measure (UM) theory. Based on intuitionistic fuzzy set (IFS) theory, the single-index measure evaluation matrix based on IFS (IFS single-index measure evaluation matrix) was obtained. By synthesizing various subjective and objective weighting methods, the range of each index weight was determined, and the index weight vector based on IFS (IFS index weight vector) was constructed. Then, the IFS single-index measure evaluation matrix and the IFS index weight vector were used to calculate the scores of rock mass samples and evaluate rock mass quality. Finally, fuzzy analysis was performed on the weights of the rock mass quality evaluation indices. The established model was applied to the underground engineering rock mass of the Guangzhou pumped-storage power plant, and the evaluation results were compared with four other established models for rock mass quality evaluation. The results show that rock mass quality evaluation based on UM-IFS is consistent with the actual situation, and that the fuzziness of an evaluation index weight has no obvious correlation with its value.
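    The single-index measure function can be pictured with a small, hypothetical sketch: a measured index value is mapped to membership degrees over adjacent quality grades by piecewise-linear interpolation, so the degrees sum to one. The grade boundaries below are invented for illustration and are not taken from the paper.

        # Piecewise-linear single-index measure function (illustrative grade boundaries).
        import numpy as np

        def single_index_measure(value, grade_bounds):
            """Membership degrees of `value` over len(grade_bounds) quality grades; sums to 1."""
            mu = np.zeros(len(grade_bounds))
            if value <= grade_bounds[0]:
                mu[0] = 1.0
            elif value >= grade_bounds[-1]:
                mu[-1] = 1.0
            else:
                for i in range(len(grade_bounds) - 1):
                    lo, hi = grade_bounds[i], grade_bounds[i + 1]
                    if lo <= value <= hi:
                        mu[i] = (hi - value) / (hi - lo)   # share assigned to the lower grade
                        mu[i + 1] = 1.0 - mu[i]
                        break
            return mu

        # Example: a rock strength index of 85 against four hypothetical grade boundaries.
        print(single_index_measure(85.0, [30.0, 60.0, 120.0, 200.0]))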

    Decentralized Riemannian Algorithm for Nonconvex Minimax Problems

    No full text
    Minimax optimization over Riemannian manifolds (possibly with nonconvex constraints) has been actively applied to solve many problems, such as robust dimensionality reduction and deep neural networks with orthogonal weights (Stiefel manifold). Although many optimization algorithms for minimax problems have been developed in the Euclidean setting, it is difficult to convert them to Riemannian cases, and algorithms for nonconvex minimax problems with nonconvex constraints are even rarer. On the other hand, to address big-data challenges, decentralized (serverless) training techniques have recently been emerging, since they can reduce communication overhead and avoid the bottleneck problem on the server node. Nonetheless, algorithms for decentralized Riemannian minimax problems have not been studied. In this paper, we study the distributed nonconvex-strongly-concave minimax optimization problem over the Stiefel manifold and propose both deterministic and stochastic minimax methods. The Stiefel manifold is a nonconvex set. The global function is represented as the finite sum of local functions. For the deterministic setting, we propose DRGDA and prove that our deterministic method achieves a gradient complexity of O(ε^-2) under mild conditions. For the stochastic setting, we propose DRSGDA and prove that our stochastic method achieves a gradient complexity of O(ε^-4). DRGDA and DRSGDA are the first algorithms for distributed minimax optimization with nonconvex constraints that achieve exact convergence. Extensive experimental results on deep neural network (DNN) training over the Stiefel manifold demonstrate the efficiency of our algorithms.
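    As background for the manifold setting, here is a minimal sketch of a single Riemannian gradient step on the Stiefel manifold St(n, p) = {X : X^T X = I}: project the Euclidean gradient onto the tangent space, take a step, and retract back onto the manifold with a QR decomposition. This is textbook Stiefel geometry of the kind such methods build on, not the paper's full decentralized DRGDA/DRSGDA algorithm.

        # One Riemannian gradient step on the Stiefel manifold (projection + QR retraction).
        import numpy as np

        def stiefel_step(X, euclid_grad, lr):
            sym = (X.T @ euclid_grad + euclid_grad.T @ X) / 2.0
            riem_grad = euclid_grad - X @ sym          # tangent-space projection at X
            Q, R = np.linalg.qr(X - lr * riem_grad)    # retract back onto the manifold
            return Q * np.where(np.diag(R) < 0, -1.0, 1.0)   # canonical column signs

        # Example: minimize f(X) = -trace(X^T A X) on St(5, 2) (leading eigenvectors of A).
        rng = np.random.default_rng(0)
        A = rng.standard_normal((5, 5)); A = A + A.T
        X, _ = np.linalg.qr(rng.standard_normal((5, 2)))
        for _ in range(200):
            X = stiefel_step(X, -2.0 * A @ X, lr=0.05)
        print(np.round(X.T @ X, 6))                    # remains (approximately) orthonormal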